Supporting bi-cluster interpretation in 0/1 data by means of local patterns
Abstract
Clustering or co-clustering techniques have proved useful in many application domains. A weakness of these techniques remains their poor support for characterizing the resulting groups. As a result, interpreting clustering results and discovering knowledge from them can be quite hard. We consider potentially large Boolean data sets which record properties of objects, and we assume the availability of a bi-partition which has to be characterized by means of a symbolic description. Our generic approach exploits collections of local patterns which satisfy some user-defined constraints in the data, and a measure of the accuracy of a given local pattern as a bi-cluster characterization pattern. We consider local patterns which are bi-sets, i.e., sets of objects associated with sets of properties. Two concrete examples are formal concepts (i.e., associated closed sets) and the so-called δ-bi-sets (i.e., an extension of formal concepts towards fault tolerance). We introduce the idea of characterizing queries, which experts can use to support knowledge discovery from bi-partitions thanks to available local patterns. The added value is illustrated on benchmark data and three real data sets: a medical data set and two gene expression data sets.
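To make the notions of bi-set and formal concept concrete, here is a minimal Python sketch on a toy Boolean data set. The data values, the function names, and in particular `overlap_score` are illustrative assumptions: the abstract does not define its accuracy measure, so the score below is only a stand-in and not the authors' measure. The enumeration is brute force and is meant for tiny examples only.

```python
from itertools import combinations

# Toy Boolean data (illustrative): each object maps to the properties it has.
DATA = {
    "o1": {"p1", "p2"},
    "o2": {"p1", "p2"},
    "o3": {"p2", "p3"},
    "o4": {"p3"},
}
PROPS = {"p1", "p2", "p3"}


def extent(props):
    """Objects that have every property in props."""
    return frozenset(o for o, ps in DATA.items() if props <= ps)


def intent(objs):
    """Properties shared by every object in objs."""
    return frozenset(p for p in PROPS if all(p in DATA[o] for o in objs))


def formal_concepts():
    """Enumerate all bi-sets (objects, properties) that are formal concepts.

    Brute force over property subsets: close each subset through
    extent and intent, and collect the distinct closed pairs.
    """
    found = set()
    for k in range(len(PROPS) + 1):
        for subset in combinations(sorted(PROPS), k):
            objs = extent(frozenset(subset))
            found.add((objs, intent(objs)))
    return found


def overlap_score(bi_set, bi_cluster):
    """Hypothetical accuracy: share of the bi-set's objects and
    properties that fall inside the given bi-cluster."""
    (objs, props), (c_objs, c_props) = bi_set, bi_cluster
    total = len(objs) + len(props)
    if total == 0:
        return 0.0
    return (len(objs & c_objs) + len(props & c_props)) / total


# An assumed bi-cluster taken from some bi-partition of the toy data.
CLUSTER_A = (frozenset({"o1", "o2"}), frozenset({"p1", "p2"}))

# Rank the concepts as candidate characterization patterns for CLUSTER_A.
RANKED = sorted(formal_concepts(),
                key=lambda c: overlap_score(c, CLUSTER_A),
                reverse=True)
```

In this sketch, a characterizing query would amount to filtering or ranking the collected concepts against one bi-cluster, as `RANKED` does. A δ-bi-set miner could be substituted for `formal_concepts` without changing the scoring step.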
Similar resources
Towards Constrained Co-clustering in Ordered 0/1 Data Sets
Within 0/1 data, co-clustering provides a collection of bi-clusters, i.e., linked clusters for both objects and Boolean properties. Besides the classical need for grouping quality optimization, one can also use user-defined constraints to capture subjective interestingness aspects and thus to improve bi-cluster relevancy. We consider the case of 0/1 data where at least one dimension is ordered, e...
Constraint-driven co-clustering of 0/1 data
We investigate a co-clustering framework (i.e., a method that provides a partition of objects and a linked partition of features) for binary data sets. So far, constrained co-clustering has seldom been explored. First, we consider straightforward extensions of the classical instance-level constraints (Must-link, Cannot-link) to express relationships on both objects and features. Furthermore, ...
Classification in Dynamic Contrast-enhanced Breast MRI: Comparison of Cluster-based approach and BI-RADS Criteria
This study introduces an automatic classification of kinetic patterns on dynamic contrast-enhanced (DCE) breast MR images. We applied k-means cluster analysis to real-world tumor cases to derive a reasonable number of reference patterns, each representative of its cluster. The technique then classifies the tumor-specific patterns from the given MRI data by measuring vector dis...
A Bi-clustering Framework for Categorical Data
Bi-clustering is a promising conceptual clustering approach. Within categorical data, it provides a collection of (possibly overlapping) bi-clusters, i.e., linked clusters for both objects and attribute-value pairs. We propose a generic framework for bi-clustering which makes it possible to compute a bi-partition from collections of local patterns which capture locally strong associations between objects...
Situation and Text: Representation of Migrants Whilst the Escalation of Refugee Crisis in Great Britain as Compared to Russia
Increasing migration is a vital concern for a globalizing sociocultural environment in today’s world. The UK and developed European countries have become an attractive destination for asylum seekers (labelled as “migrants”) in the past decade. The rapid rise in the number of asylum seekers, which was labelled “migration crisis” (Ruz, 2015), made this topic an integral part of scientific discuss...
Journal title:
- Intell. Data Anal.
Volume 10, Issue
Pages -
Publication date: 2006